General time-reversible distances with unequal rates across sites: mixing gamma and inverse Gaussian distributions with invariant sites.
Authors
Abstract
A series of new results useful to the study of DNA sequences using Markov models of substitution are presented with proofs. General time-reversible distances can be extended to accommodate any fixed distribution of rates across sites by replacing the logarithmic function of a matrix with the inverse of a moment generating function. Estimators are presented assuming a gamma distribution, the inverse Gaussian distribution, or a mixture of either of these with invariant sites. Also considered are the different ways invariant sites may be removed and how these differences may affect estimated distances. Through collaboration, we implemented these distances into PAUP in 1994. The variance of these new distances is approximated via the delta method. It is also shown how to predict the divergence expected for a pair of sequences given a rate matrix and a distribution of rates across sites, allowing iterated ML estimates of distances under any reversible model. A simple test of whether a rate matrix is time reversible is also presented. These new methods are used to estimate the divergence time of humans and chimps from mtDNA sequence data. These analyses support suggestions that the human lineage has an enhanced transition rate relative to other hominoids. These studies also show that transversion distances differ substantially from the overall distances which are dominated by transitions. Transversions alone apparently suggest a very recent divergence time for humans versus chimps and/or a very old (> 16 myr) divergence time for humans versus orangutans. This work illustrates graphically ways to interpret the reliability of distance-based transformations, using the corrected transition to transversion ratio returned for pairs of sequences which are successively more diverged.
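The central idea in the abstract, replacing the matrix logarithm with the inverse of a moment generating function, can be illustrated with a small numerical sketch. This is not the authors' implementation or the PAUP code. It assumes the standard trace form of the distance, d = -trace(Π M⁻¹(Π⁻¹F)), where F is the symmetrized divergence matrix, Π the diagonal matrix of base frequencies, and M the moment generating function of a mean-one rate distribution. All function names, the parameter `v` (the variance of the inverse Gaussian), and the example numbers are hypothetical choices made for illustration. Invariant sites and the delta-method variance are not covered.

```python
import numpy as np

def _sym_apply(S, f):
    """Apply a scalar function f to a symmetric matrix S through its eigenvalues."""
    w, V = np.linalg.eigh(S)
    return (V * f(w)) @ V.T

# Moment generating functions of mean-one rate distributions and their inverses.
def gamma_mgf(alpha):
    return lambda x: (1.0 - x / alpha) ** (-alpha)

def gamma_inv_mgf(alpha):
    return lambda y: alpha * (1.0 - y ** (-1.0 / alpha))

def invgauss_mgf(v):  # v = variance of the mean-one inverse Gaussian (assumed parameterization)
    return lambda x: np.exp((1.0 - np.sqrt(1.0 - 2.0 * v * x)) / v)

def invgauss_inv_mgf(v):
    return lambda y: np.log(y) - 0.5 * v * np.log(y) ** 2

def gtr_rate_matrix(exch, pi):
    """Reversible Q from symmetric exchangeabilities, scaled to one substitution per unit time."""
    Q = exch * pi[None, :]
    np.fill_diagonal(Q, 0.0)
    np.fill_diagonal(Q, -Q.sum(axis=1))
    return Q / -np.sum(pi * np.diag(Q))

def expected_divergence(Q, pi, t, mgf=np.exp):
    """Predicted divergence matrix F = Pi * M(Q t) for a reversible Q."""
    r = np.sqrt(pi)
    S = (Q * t) * r[:, None] / r[None, :]              # Pi^{1/2} Q t Pi^{-1/2}, symmetric
    P = _sym_apply(S, mgf) * r[None, :] / r[:, None]   # Pi^{-1/2} M(S) Pi^{1/2}
    return pi[:, None] * P

def gtr_distance(F, inv_mgf=np.log):
    """Distance d = -trace(Pi * M^{-1}(Pi^{-1} F)); np.log gives the equal-rates case."""
    F = (F + F.T) / 2.0            # enforce the symmetry expected under reversibility
    F = F / F.sum()
    pi = F.sum(axis=1)
    r = np.sqrt(pi)
    S = F / np.outer(r, r)         # similar to Pi^{-1} F, so f(S) has the same diagonal
    return -np.sum(pi * np.diag(_sym_apply(S, inv_mgf)))

# Hypothetical example: simulate gamma-distributed rates, then recover the distance.
pi = np.array([0.1, 0.2, 0.3, 0.4])
exch = np.array([[0, 1, 4, 1], [1, 0, 1, 4], [4, 1, 0, 1], [1, 4, 1, 0]], dtype=float)
Q = gtr_rate_matrix(exch, pi)
F = expected_divergence(Q, pi, 0.7, gamma_mgf(0.5))
print(gtr_distance(F, gamma_inv_mgf(0.5)))  # matches the true distance 0.7 up to rounding
print(gtr_distance(F))                      # the plain logarithm underestimates it
```

Working with the symmetric matrix Π^{-1/2} F Π^{-1/2} rather than Π⁻¹F directly keeps the eigenvalues real, which is a convenience of this sketch and relies on the reversibility assumption.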
Similar resources
Is the general time-reversible model bad for molecular phylogenetics?
The general time-reversible (GTR) model (Tavaré, 1986) has been the workhorse of molecular phylogenetics for the last decade. GTR sits at the top of the ModelTest hierarchy of models (Posada & Crandall, 1998) and, usually with the addition of invariant sites and a gamma distribution of rates across sites, is currently by far the most commonly selected model for phylogenetic inference (see Table...
Analytic Solutions for Three-Taxon MLMC Trees with Variable Rates Across Sites
We consider the problem of finding the maximum likelihood rooted tree under a molecular clock (MLMC), with three species and 2-state characters under a symmetric model of substitution. For identically distributed rates per site this is probably the simplest phylogenetic estimation problem, and it is readily solved numerically. Analytic solutions, on the other hand, were obtained only recently (...
Truncated Linear Minimax Estimator of a Power of the Scale Parameter in a Lower- Bounded Parameter Space
Minimax estimation problems with restricted parameter spaces have attracted increasing interest over the last two decades. Some authors have derived minimax and admissible estimators of bounded parameters under squared error loss and scale-invariant squared error loss. In some truncated estimation problems, the most natural estimator to be considered is the truncated version of a classic...
Maximum likelihood estimation of phylogenetic trees is consistent when substitution rates vary according to the invariable sites plus gamma distribution.
Maximum likelihood estimation of phylogenetic trees from nucleotide sequences is completely consistent when nucleotide substitution is governed by the general time reversible (GTR) model with rates that vary over sites according to the invariable sites plus gamma (I + gamma) distribution.
Bayesian Estimation of Shift Point in Shape Parameter of Inverse Gaussian Distribution Under Different Loss Functions
In this paper, a Bayesian approach is proposed for shift point detection in an inverse Gaussian distribution. The mean parameter of the inverse Gaussian distribution is assumed to be constant, and shift points in the shape parameter are considered. First, the posterior distribution of the shape parameter is obtained. Then the Bayes estimators are derived under a class of priors and using variou...
Journal title:
- Molecular phylogenetics and evolution
Volume 8, Issue 3
Pages -
Publication date 1997